fix(claude): honor CLAUDE_BIN_PATH in dev mode for libc-mismatch hosts#1481
fix(claude): honor CLAUDE_BIN_PATH in dev mode for libc-mismatch hosts#1481
Conversation
📝 WalkthroughWalkthroughChecks Changes
Sequence Diagram(s)sequenceDiagram
participant Caller
participant Resolver
participant Env as Environment
participant FS as FileSystem
participant Config as Config/Bundled
Caller->>Resolver: resolveClaudeBinaryPath(config?)
Resolver->>Env: read CLAUDE_BIN_PATH
Env-->>Resolver: envPath (set / unset / empty)
alt envPath set and non-empty
Resolver->>FS: fileExists(envPath)
FS-->>Resolver: true
Resolver-->>Caller: return envPath
else file missing
FS-->>Resolver: false
Resolver-->>Caller: throw "CLAUDE_BIN_PATH is set but the file does not exist"
else envPath unset or empty
Resolver->>Resolver: if !BUNDLED_IS_BINARY -> return undefined (dev)
alt BUNDLED_IS_BINARY true
Resolver->>Config: check configClaudeBinaryPath
Config-->>Resolver: configPath (or none)
alt configPath set
Resolver->>FS: fileExists(configPath)
FS-->>Resolver: true/false
Resolver-->>Caller: return configPath or throw
else
Resolver->>Config: autodetect bundled binary
Config-->>Resolver: detectedPath or none
Resolver-->>Caller: return detectedPath or throw install instructions
end
end
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Review rate limit: 6/8 reviews remaining, refill in 8 minutes and 2 seconds.Comment |
Review SummaryVerdict: minor-fixes-needed Your fix correctly hoists Blocking issues
Suggested fixes
Minor / nice-to-have
Compliments
Reviewed via maintainer-review-pr workflow (Pi/Minimax). Aspects run: code-review, error-handling, test-coverage, comment-quality, docs-impact. |
- CHANGELOG entry under [Unreleased] / Fixed describing the dev-mode CLAUDE_BIN_PATH escape hatch (previously ignored). Notes that config-file path remains binary-mode-only and that env-loading + target-repo .env isolation are unchanged downstream. - Empty-string test pinning that CLAUDE_BIN_PATH='' falls through to undefined rather than throwing — protects against a future predicate typo that would treat empty as "set". - One-line note in ai-assistants.md "Binary path configuration" section pointing dev-mode users at the env-var override for the glibc/musl mismatch case. Skipped from the review: - The other two docs-page rewrites (configuration.md / troubleshooting.md): the error message itself names CLAUDE_BIN_PATH, and #1474 documents the use case publicly. One mention in ai-assistants.md is enough for discovery. - Type-style consistency tweaks in the test file: pure bikeshed.
The Claude Agent SDK auto-resolves its bundled native binary in [linux-x64-musl, linux-x64] order. On glibc Linux hosts (Ubuntu/Debian/ Fedora), Bun installs both via optionalDependencies and the musl variant is picked first; its ELF interpreter (/lib/ld-musl-x86_64.so.1) does not exist on glibc, so spawn fails and the SDK reports a misleading "binary not found" — the file is on disk, the loader is not. The documented escape hatch CLAUDE_BIN_PATH was dead code in dev mode: the resolver early-returned undefined when BUNDLED_IS_BINARY=false before ever reading the env var. The only workaround was patching node_modules. Move the env-var block above the BUNDLED_IS_BINARY return. Config-file path stays binary-mode-only — it's per-repo, not per-machine; env is the right knob for libc mismatches. Behavior preserved: - env unset → unchanged (undefined in dev, autodetect/throw in binary) - env set + file exists → resolved (was binary-only; now also dev) - env set + file missing → clear error (was binary-only; now also dev) Closes #1474
- CHANGELOG entry under [Unreleased] / Fixed describing the dev-mode CLAUDE_BIN_PATH escape hatch (previously ignored). Notes that config-file path remains binary-mode-only and that env-loading + target-repo .env isolation are unchanged downstream. - Empty-string test pinning that CLAUDE_BIN_PATH='' falls through to undefined rather than throwing — protects against a future predicate typo that would treat empty as "set". - One-line note in ai-assistants.md "Binary path configuration" section pointing dev-mode users at the env-var override for the glibc/musl mismatch case. Skipped from the review: - The other two docs-page rewrites (configuration.md / troubleshooting.md): the error message itself names CLAUDE_BIN_PATH, and #1474 documents the use case publicly. One mention in ai-assistants.md is enough for discovery. - Type-style consistency tweaks in the test file: pure bikeshed.
6e67d8a to
8188492
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (2)
packages/providers/src/claude/binary-resolver-dev.test.ts (1)
23-39: Prefer per-test spy teardown (afterEachor test-localtry/finally) to avoid leak-on-failure.Current cleanup is good, but moving spy restore to per-test teardown makes isolation more deterministic if a test exits early.
Refactor sketch
-import { describe, test, expect, mock, beforeEach, afterAll, spyOn } from 'bun:test'; +import { describe, test, expect, mock, beforeEach, afterEach, afterAll, spyOn } from 'bun:test'; @@ beforeEach(() => { delete process.env.CLAUDE_BIN_PATH; - fileExistsSpy?.mockRestore(); - fileExistsSpy = undefined; }); + + afterEach(() => { + fileExistsSpy?.mockRestore(); + fileExistsSpy = undefined; + });Also applies to: 51-74
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/providers/src/claude/binary-resolver-dev.test.ts` around lines 23 - 39, The tests currently restore the fileExistsSpy only in beforeEach and afterAll which can leak if a test fails early; change cleanup to per-test teardown by moving fileExistsSpy?.mockRestore() into an afterEach (or wrap each test body with try/finally to restore) so each test restores fileExistsSpy immediately after it runs; update the same pattern in the other block referenced (lines 51-74) so both occurrences of the fileExistsSpy mock are restored in afterEach (or via test-local try/finally) and set fileExistsSpy = undefined in that afterEach as well.packages/providers/src/claude/binary-resolver.ts (1)
5-12: Consider consolidating the glibc/musl rationale to one concise comment.The same platform-mismatch explanation is duplicated in two comment blocks; keeping one short canonical note will reduce documentation drift.
Also applies to: 71-73
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/providers/src/claude/binary-resolver.ts` around lines 5 - 12, Consolidate the duplicated glibc/musl platform-mismatch rationale into a single concise comment near the resolution order for pathToClaudeCodeExecutable: remove the repeated paragraph (the one also present around lines 71-73) and replace with one short canonical note that mentions the platform mismatch concern and that CLAUDE_BIN_PATH can override the auto-resolved per-platform binary; update the comment around the resolution list (referencing CLAUDE_BIN_PATH and pathToClaudeCodeExecutable) so it is the single source of truth.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@packages/providers/src/claude/binary-resolver-dev.test.ts`:
- Around line 23-39: The tests currently restore the fileExistsSpy only in
beforeEach and afterAll which can leak if a test fails early; change cleanup to
per-test teardown by moving fileExistsSpy?.mockRestore() into an afterEach (or
wrap each test body with try/finally to restore) so each test restores
fileExistsSpy immediately after it runs; update the same pattern in the other
block referenced (lines 51-74) so both occurrences of the fileExistsSpy mock are
restored in afterEach (or via test-local try/finally) and set fileExistsSpy =
undefined in that afterEach as well.
In `@packages/providers/src/claude/binary-resolver.ts`:
- Around line 5-12: Consolidate the duplicated glibc/musl platform-mismatch
rationale into a single concise comment near the resolution order for
pathToClaudeCodeExecutable: remove the repeated paragraph (the one also present
around lines 71-73) and replace with one short canonical note that mentions the
platform mismatch concern and that CLAUDE_BIN_PATH can override the
auto-resolved per-platform binary; update the comment around the resolution list
(referencing CLAUDE_BIN_PATH and pathToClaudeCodeExecutable) so it is the single
source of truth.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: a4d22e1e-d5bc-4052-942a-8a1cbcb13953
📒 Files selected for processing (4)
CHANGELOG.mdpackages/docs-web/src/content/docs/getting-started/ai-assistants.mdpackages/providers/src/claude/binary-resolver-dev.test.tspackages/providers/src/claude/binary-resolver.ts
✅ Files skipped from review due to trivial changes (1)
- packages/docs-web/src/content/docs/getting-started/ai-assistants.md
🚧 Files skipped from review as they are similar to previous changes (1)
- CHANGELOG.md
Regression from #1481 (honor CLAUDE_BIN_PATH in dev mode). The CI workflow set `CLAUDE_BIN_PATH: ~/.local/bin/claude` in YAML `env:` blocks; YAML does not expand `~`, so the literal string was passed to the resolver. Before #1481, dev mode silently ignored the env var and the SDK auto-resolved its bundled binary — so the broken value was harmless. After #1481, dev mode honors it, the file-existence check fails on the literal `~`, and the smoke job aborts with "CLAUDE_BIN_PATH is set ... but the file does not exist". Move the env-var assignment into the run-step shell where `$HOME` resolves. Both e2e-claude and e2e-mixed-providers jobs are affected.
* fix(core/test): split connection.test.ts from DB-test batch to avoid mock pollution (coleam00#1269) messages.test.ts uses mock.module('./connection', ...) at module-load time. Per CLAUDE.md:131 (Bun issue oven-sh/bun#7823), mock.module() is process- global and irreversible. When Bun pre-loads all test files in a batch, the mock shadows the real connection module before connection.test.ts runs, causing getDatabaseType() to always return the mocked value regardless of DATABASE_URL. Move connection.test.ts into its own `bun test` invocation immediately after postgres.test.ts (which runs alone) and before the big DB/utils/ config/state batch that contains messages.test.ts. This follows the same isolation pattern already used for command-handler, clone, postgres, and path-validation tests. * fix(setup): align PORT default on 3090 across .env.example, wizard, and JSDoc (coleam00#1152) (coleam00#1271) The server's getPort() fallback changed from 3000 to 3090 in the Hono migration (coleam00#318), but .env.example, the setup wizard's generated .env, and the JSDoc describing the fallback were not updated — leaving three different sources of truth for "the default PORT." When the wizard writes PORT=3000 to ~/.archon/.env (which the Hono server loads with override: true, while Vite only reads repo-local .env), the two processes can land on different ports silently. That mismatch is the real mechanism behind the failure described in coleam00#1152. - .env.example: comment out PORT, document 3090 as the default - packages/cli/src/commands/setup.ts: wizard no longer writes PORT=3000 into the generated .env; fix the "Additional Options" note - packages/cli/src/commands/setup.test.ts: assert no bare PORT= line and the commented default is present - packages/core/src/utils/port-allocation.ts: fix stale JSDoc "default 3000" -> "default 3090" - deploy/.env.example: keep Docker default at 3000 (compose/Caddy target that) but annotate it so users don't copy it for local dev Single source of truth for the local-dev default is now basePort in port-allocation.ts. * fix(providers/claude): use || instead of ?? in hasExplicitTokens to handle empty-string env vars (coleam00#1028) Closes coleam00#1027 * chore(deps): bump claude-agent-sdk to 0.2.121, codex-sdk to 0.125.0 (coleam00#1460) Both SDKs were ~30 patch releases behind. Validation suite passes (type-check, lint, format, tests across all 10 packages) without code changes. The only sustained Claude SDK behavior change in the range — v0.2.111's options.env overlay/replace flap, since reverted to overlay — is a no-op for Archon, which already passes { ...process.env } as the SDK env. * fix(cli): lazy-import bundled skill files so non-setup commands don't crash on missing source (coleam00#1394) The 18 top-level `import … with { type: 'text' }` statements in `bundled-skill.ts` resolve at module load. For `bun build --compile` that's build time, so the binary embeds the strings and works regardless of any on-disk skill files. For `bun link` (linked-source) installs that's every `archon` invocation — including `archon --help`, which doesn't even use the skill content. If any of the 18 source files are missing or moved, the import fails and the CLI cannot start at all. The skill content is data the binary deploys via `archon setup`, not data the CLI needs at runtime. There's only one consumer in production code: `copyArchonSkill()` in `setup.ts`. Moving the import into that function as a dynamic import preserves the compiled-binary behavior (Bun's bundler statically analyses literal-string `import()` and embeds the chunk — verified by grepping the SKILL.md frontmatter out of a freshly compiled binary) while making the linked-source install resilient: only `archon setup` triggers the bundled-skill module load now. Verified: a known skill string appears in the compiled binary 1×, and `archon --help` no longer needs the source files to start. `copyArchonSkill()` becomes async because the dynamic import is a Promise. The single production call site is already in an async function and gets an `await`. The four `setup.test.ts` cases become async too. * fix(claude): stop passing --no-env-file to native binary in dev mode (coleam00#1461) * fix(claude): stop passing --no-env-file to native binary in dev mode The Claude Agent SDK switched from shipping `cli.js` inside the package to per-platform native binaries via optional deps somewhere in the 0.2.x series. As of 0.2.121 there is no `cli.js` in the SDK package; dev mode resolves to `@anthropic-ai/claude-agent-sdk-darwin-arm64/claude` (Mach-O). That native binary rejects `--no-env-file` with `error: unknown option '--no-env-file'` and the subprocess exits 1. `shouldPassNoEnvFile` was returning true on `cliPath === undefined` on the assumption that "dev mode = JS executable run via Bun". That assumption is dead. Tighten the predicate to only return true on an explicit `.js` suffix, so we only emit the flag when the SDK is going to spawn a Bun-runnable script. CWD `.env` leak protection is unaffected. `stripCwdEnv()` in `@archon/paths` (coleam00#1067) deletes Bun-auto-loaded `.env`/`.env.local`/ `.env.development`/`.env.production` keys from `process.env` at every Archon entry point before any subprocess is spawned. The native Claude binary does not auto-load `.env` from its cwd either. `--no-env-file` was belt-and-suspenders for the JS-via-Bun case only. Verified end-to-end with a sentinel: added a unique `ARCHON_LEAK_SENTINEL_$$` to Archon's `.env`, ran e2e-claude-smoke with a bash probe checking the subprocess env. stderr shows `[archon] stripped 23 keys from /Users/rasmus/Projects/cole/Archon (.env, .env.local)` — sentinel was deleted. Bash node prints `PASS: simple='4', no sentinel leak`. Workflow completes cleanly, no `--no-env-file` rejection from the SDK binary. bun run validate: green across all 10 packages. * fix(claude): address review on coleam00#1461 (stale docs + test gaps) Critical: file-level JSDoc at provider.ts:18 still claimed dev mode resolves cli.js. Updated to reflect SDK 0.2.x's switch to per-platform native binaries. Important: security.md still listed --no-env-file as item 2 of target-repo .env isolation. Scoped that bullet to legacy Bun-runnable JS entry points and called out that native binaries don't auto-load .env from cwd. Added an Unreleased Fixed entry to CHANGELOG.md. Updated binary-resolver.ts JSDoc title that referenced cli.js. Polish: widened the predicate to accept .mjs and .cjs (also Bun-runnable JS — matches the SDK's own internal extension list). Dropped the redundant `passesNoEnvFile` log field that mirrored `isJsExecutable`. Added unit cases for .mjs/.cjs (now true) and .ts/.tsx/.jsx (deliberately false — never SDK entry points). Added an integration test that mocks resolveClaudeBinaryPath to return a .js path and asserts executableArgs: ['--no-env-file'] flows through buildBaseClaudeOptions all the way to the SDK call — catches future regressions in the conditional spread. bun run validate: green across all 10 packages. * fix(orchestrator): clear stale session ID on error_during_execution to prevent infinite failure loop (coleam00#1294) * fix(orchestrator): clear stale session ID on error_during_execution to prevent infinite failure loop When a Claude API session expires (e.g. after container restart), the orchestrator persists the new (failed) session ID from the error result, causing every subsequent message in that conversation to hit the same error — an infinite failure loop. Fix: on error_during_execution result, set assistant_session_id to NULL instead of persisting the failed session ID. The next message starts a fresh session with full context rebuilt from the DB. Conversation history is unaffected since it lives in remote_agent_messages, independent of the Claude session. Changes: - updateSession() and tryPersistSessionId() now accept string | null - Both handleStreamMode and handleBatchMode clear session ID on error_during_execution Fixes coleam00#1280 * test(orchestrator): add stale session clearing tests + address review feedback Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com> Signed-off-by: kagura-agent <kagura.agent.ai@gmail.com> --------- Signed-off-by: kagura-agent <kagura.agent.ai@gmail.com> Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com> * fix(claude): honor CLAUDE_BIN_PATH in dev mode for libc-mismatch hosts (coleam00#1481) * fix(claude): honor CLAUDE_BIN_PATH in dev mode for libc-mismatch hosts The Claude Agent SDK auto-resolves its bundled native binary in [linux-x64-musl, linux-x64] order. On glibc Linux hosts (Ubuntu/Debian/ Fedora), Bun installs both via optionalDependencies and the musl variant is picked first; its ELF interpreter (/lib/ld-musl-x86_64.so.1) does not exist on glibc, so spawn fails and the SDK reports a misleading "binary not found" — the file is on disk, the loader is not. The documented escape hatch CLAUDE_BIN_PATH was dead code in dev mode: the resolver early-returned undefined when BUNDLED_IS_BINARY=false before ever reading the env var. The only workaround was patching node_modules. Move the env-var block above the BUNDLED_IS_BINARY return. Config-file path stays binary-mode-only — it's per-repo, not per-machine; env is the right knob for libc mismatches. Behavior preserved: - env unset → unchanged (undefined in dev, autodetect/throw in binary) - env set + file exists → resolved (was binary-only; now also dev) - env set + file missing → clear error (was binary-only; now also dev) Closes coleam00#1474 * chore(claude): address CodeRabbit review on coleam00#1481 - CHANGELOG entry under [Unreleased] / Fixed describing the dev-mode CLAUDE_BIN_PATH escape hatch (previously ignored). Notes that config-file path remains binary-mode-only and that env-loading + target-repo .env isolation are unchanged downstream. - Empty-string test pinning that CLAUDE_BIN_PATH='' falls through to undefined rather than throwing — protects against a future predicate typo that would treat empty as "set". - One-line note in ai-assistants.md "Binary path configuration" section pointing dev-mode users at the env-var override for the glibc/musl mismatch case. Skipped from the review: - The other two docs-page rewrites (configuration.md / troubleshooting.md): the error message itself names CLAUDE_BIN_PATH, and coleam00#1474 documents the use case publicly. One mention in ai-assistants.md is enough for discovery. - Type-style consistency tweaks in the test file: pure bikeshed. * fix(deps): bump hono to ^4.12.16 and @hono/node-server to ^1.19.13 (closes coleam00#1484) (coleam00#1499) * fix(orchestrator): create ~/.archon/workspaces before AI provider spawn (coleam00#1529) * fix(orchestrator): create ~/.archon/workspaces before AI provider spawn On a fresh install, ~/.archon/workspaces doesn't exist yet. The orchestrator passes that path as cwd to the AI provider, which calls spawn() — which raises ENOENT. The error is then misclassified as "binary not found" in the friendly-error path, surfacing as an incorrect "Claude binary not found" message. Add ensureArchonWorkspacesPath() in @archon/paths that mkdir -p's the directory and returns the path. Use it at the orchestrator's spawn-cwd site so the directory is guaranteed to exist before spawn(). Other call sites of getArchonWorkspacesPath() (workflow discovery, path-prefix comparisons) only consume the path string and don't need the directory to exist; they keep using the pure getter. Closes coleam00#1528 * test(orchestrator): assert ensureArchonWorkspacesPath is called Capture the @archon/paths mock as a named variable and assert it was called in the syncWorkspace handleMessage path. Without this, the test suite passes even if orchestrator-agent.ts:824 reverts to the non-ensuring getArchonWorkspacesPath() variant — exactly the regression that surfaced as 'Claude Code native binary not found' in coleam00#1528. * docs(changelog): add Tier 1 batch 2 cherry-pick entry Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Signed-off-by: kagura-agent <kagura.agent.ai@gmail.com> Co-authored-by: Rasmus Widing <152263317+Wirasm@users.noreply.github.com> Co-authored-by: DIY Smart Code <thomas@thirty3.de> Co-authored-by: Cocoon-Break <54054995+kuishou68@users.noreply.github.com> Co-authored-by: Kagura <kagura.agent.ai@gmail.com> Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com> Co-authored-by: Yasser <116118149+YrFnS@users.noreply.github.com> Co-authored-by: Truffle <truffleagent@gmail.com> Co-authored-by: cjnprospa <sirhcle.j23@gmail.com>
Regression from coleam00#1481 (honor CLAUDE_BIN_PATH in dev mode). The CI workflow set `CLAUDE_BIN_PATH: ~/.local/bin/claude` in YAML `env:` blocks; YAML does not expand `~`, so the literal string was passed to the resolver. Before coleam00#1481, dev mode silently ignored the env var and the SDK auto-resolved its bundled binary — so the broken value was harmless. After coleam00#1481, dev mode honors it, the file-existence check fails on the literal `~`, and the smoke job aborts with "CLAUDE_BIN_PATH is set ... but the file does not exist". Move the env-var assignment into the run-step shell where `$HOME` resolves. Both e2e-claude and e2e-mixed-providers jobs are affected.
Summary
archon workflow runfails immediately in dev mode (bun run dev:server). The Claude Agent SDK auto-resolves its bundled native binary in[linux-x64-musl, linux-x64]order, picks the musl variant first, but the musl ELF interpreter (/lib/ld-musl-x86_64.so.1) does not exist on glibc — spawn fails and the SDK reports a misleading "binary not found" error.CLAUDE_BIN_PATHis dead code in dev mode (early-returned before the env var is read), so the only workaround is patchingnode_modules.CLAUDE_BIN_PATHenv-var block inbinary-resolver.tsabove theBUNDLED_IS_BINARYearly return. Updatedbinary-resolver-dev.test.tsto reflect the new contract.assistants.claude.claudeBinaryPath) is intentionally still binary-mode-only — it's a per-repo setting, not a per-machine escape hatch. Autodetect and the throw-with-install-instructions paths also stay binary-mode-only. Binary-mode behavior is byte-identical to before.UX Journey
Before
After
Architecture Diagram
Before
After
Connection inventory:
process.env.CLAUDE_BIN_PATHLabel Snapshot
risk: lowsize: XScore(closest available; lives in@archon/providers)providers:claude-binary-resolverChange Metadata
bugcoreLinked Issue
Validation Evidence (required)
bun run validategates exit 0; both resolver test files pass (5 dev-mode + 9 binary-mode).Security Impact (required)
Yes, describe risk and mitigation: N/ACompatibility / Migration
CLAUDE_BIN_PATHis documented; semantics broaden only)Human Verification (required)
undefined(existing test)Side Effects / Blast Radius (required)
CLAUDE_BIN_PATHexported globally (e.g. for binary builds) and then runs dev mode will now have that path used instead of the SDK's bundled binary. This is the desired behavior (and matches what the env var name implies), but it is a behavior change for anyone who had the env var set without realizing dev mode was ignoring it. If the path is valid this is harmless; if the path is stale they'll see a clear error pointing at the path.Rollback Plan (required)
git revert bf8dabbe— single-commit revert restores prior behavior.unset CLAUDE_BIN_PATHto opt out without code changes.Risks and Mitigations
CLAUDE_BIN_PATHset in their shell may now hit the resolver-level error in dev mode where they previously didn't.Summary by CodeRabbit
Bug Fixes
Tests
Documentation
Chores